AITopics

Country:

Asia (0.68)
Europe (0.68)
North America > United States > Massachusetts (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing Systems, Apr-25-2026, 00:31:16 GMT

11715d433f6f8b9106baae0df023deb3-Paper-Conference.pdf

artificial intelligence, dataset, machine learning, (16 more...)

Country: Europe (0.14)

Genre: Research Report (0.93)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Neural Information Processing Systems, Feb-11-2026, 23:27:05 GMT

42a59a5f35b1b3c3fd648397c88a7164-Paper-Datasets_and_Benchmarks_Track.pdf

artificial intelligence, dataset, machine learning, (16 more...)

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.05)
Europe > United Kingdom > England > Leicestershire > Loughborough (0.05)
North America > United States > California > Santa Clara County > Santa Clara (0.04)
(2 more...)

Genre: Research Report (0.46)

Industry: Automobiles & Trucks (0.94)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Neural Information Processing Systems, Feb-11-2026, 04:00:11 GMT

44e3a3115ca26e5127851acd0cedd0d9-Paper-Datasets_and_Benchmarks.pdf

data mining, machine learning, programming language, (20 more...)

Country:

Oceania > New Zealand > North Island > Auckland Region > Auckland (0.05)
Asia > Singapore (0.04)
Oceania > New Zealand > North Island > Gisborne District > Gisborne (0.04)
(3 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)
Workflow (0.68)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
(3 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science > Data Mining (0.94)
(3 more...)

Neural Information Processing Systems, Feb-9-2026, 16:05:19 GMT

On Inductive Biases for Heterogeneous Treatment Effect Estimation

At this point, a range of sophisticated solutions exist which reduce the effect of confounding by balancing the covariate space [3, 4], importance weighting [11, 12, 13, 14] or propensity drop-out [15].

artificial intelligence, machine learning, treatment effect, (18 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Neural Information Processing Systems, Feb-8-2026, 00:22:29 GMT

11715d433f6f8b9106baae0df023deb3-Paper-Conference.pdf

dataset, misleading cv, noise, (14 more...)

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.05)
North America > Canada (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (0.93)

Industry: Health & Medicine (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.50)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.50)

Batsaikhan, Bilguun, Fukuda, Hiroyuki

Predicting Talent Breakout Rate using Twitter and TV data

arXiv.org Artificial Intelligence, Nov-24-2025

Early detection of rising talents is of paramount importance in the field of advertising. In this paper, we define a concept of talent breakout and propose a method to detect Japanese talents before their rise to stardom. The main focus of the study is to determine the effectiveness of combining Twitter and TV data on predicting time-dependent changes in social data. Although traditional time-series models are known to be robust in many applications, the success of neural network models in various fields (e.g., Natural Language Processing, Computer Vision, Reinforcement Learning) continues to spark an interest in the time-series community to apply new techniques in practice. Therefore, in order to find the best modeling approach, we have experimented with traditional, neural network and ensemble learning methods. We observe that ensemble learning methods outperform traditional and neural network models based on standard regression metrics. However, by utilizing the concept of talent breakout, we are able to assess the true forecasting ability of the models, where neural networks outperform traditional and ensemble learning methods in terms of precision and recall.
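The abstract compares models by precision and recall on breakout detection. As a minimal sketch of that evaluation step, assuming each talent has already been given a binary "breakout" flag (the abstract does not state how the paper derives it, and the function name and toy flags below are hypothetical):

```python
def precision_recall(actual, predicted):
    """Precision and recall for binary event flags (True = breakout)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)        # correctly flagged
    fp = sum(1 for a, p in pairs if not a and p)    # false alarms
    fn = sum(1 for a, p in pairs if a and not p)    # missed breakouts
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy flags for five talents, for illustration only (not from the paper).
actual = [True, False, True, False, True]
predicted = [True, True, False, False, True]
precision, recall = precision_recall(actual, predicted)
```

A model can score well on regression error and still miss most breakouts, which is why an event-level view like this can rank models differently.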

artificial intelligence, deep learning, machine learning, (16 more...)

doi: 10.11517/pjsai.JSAI2020.0_1K3ES202

2511.16905

Genre: Research Report > New Finding (0.67)

Industry: Information Technology > Services (0.71)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.73)

Kumar, Ramya, Gulwani, Dhruv, Singh, Sonit

Automated Analysis of Learning Outcomes and Exam Questions Based on Bloom's Taxonomy

arXiv.org Artificial Intelligence, Nov-17-2025

This paper explores the automatic classification of exam questions and learning outcomes according to Bloom's Taxonomy. A small dataset of 600 sentences labeled with six cognitive categories - Knowledge, Comprehension, Application, Analysis, Synthesis, and Evaluation - was processed using traditional machine learning (ML) models (Naive Bayes, Logistic Regression, Support Vector Machines), recurrent neural network architectures (LSTM, BiLSTM, GRU, BiGRU), transformer-based models (BERT and RoBERTa), and large language models (OpenAI, Gemini, Ollama, Anthropic). Each model was evaluated under different preprocessing and augmentation strategies (for example, synonym replacement, word embeddings, etc.). Among traditional ML approaches, Support Vector Machines (SVM) with data augmentation achieved the best overall performance, reaching 94 percent accuracy, recall, and F1 scores with minimal overfitting. In contrast, the RNN models and BERT suffered from severe overfitting, while RoBERTa initially overcame it but began to show signs as training progressed. Finally, zero-shot evaluations of large language models (LLMs) indicated that OpenAI and Gemini performed best among the tested LLMs, achieving approximately 0.72-0.73 accuracy and comparable F1 scores. These findings highlight the challenges of training complex deep models on limited data and underscore the value of careful data augmentation and simpler algorithms (such as augmented SVM) for Bloom's Taxonomy classification.
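Synonym replacement, one of the augmentation strategies named in the abstract, can be sketched roughly as follows. The synonym table, the replacement probability `p`, and the function name are illustrative assumptions, not the authors' setup:

```python
import random

# Hypothetical synonym table; a real pipeline would likely draw on a thesaurus.
SYNONYMS = {
    "explain": ["describe", "clarify"],
    "list": ["enumerate", "name"],
    "compare": ["contrast", "differentiate"],
}


def augment(sentence, synonyms, rng, p=0.5):
    """Return a copy of `sentence` with known words swapped for a synonym
    with probability `p`. Replacements are inserted in lowercase."""
    out = []
    for word in sentence.split():
        options = synonyms.get(word.lower())
        if options and rng.random() < p:
            out.append(rng.choice(options))
        else:
            out.append(word)
    return " ".join(out)


rng = random.Random(0)  # seeded so runs are repeatable
augmented = augment("Explain the water cycle", SYNONYMS, rng, p=1.0)
```

Each augmented sentence keeps the label of its source sentence, so a small labeled set such as the 600 sentences described above can be expanded without new annotation.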

classification, large language model, machine learning, (20 more...)

2511.10903

Country:

Asia (0.46)
Europe (0.46)
Oceania > Australia (0.28)
North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Education > Educational Setting > Online (0.46)
Education > Educational Technology > Educational Software (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.55)

Pittman, Jason M., Phillips, Anton Jr., Medina-Santos, Yesenia, Stark, Brielle C.

Towards a Method for Synthetic Generation of Persons with Aphasia Transcripts

arXiv.org Artificial Intelligence, Oct-31-2025

Jason M. Pittman (1), Anton Phillips Jr. (2), Yesenia Medina-Santos (2), Brielle C. Stark (2)
(1) University of Maryland Global Campus; (2) Indiana University Bloomington, Department of Speech, Language and Hearing Sciences

Abstract: In aphasia research, Speech-Language Pathologists (SLPs) devote extensive time to manually coding speech samples using Correct Information Units (CIUs), a measure of how informative an individual sample of speech is. Developing automated systems to recognize aphasic language is limited by data scarcity. For example, only about 600 transcripts are available in AphasiaBank yet billions of tokens are used to train large language models (LLMs). In the broader field of machine learning (ML), researchers increasingly turn to synthetic data when such are sparse. Therefore, this study constructs and validates two methods to generate synthetic transcripts of the AphasiaBank Cat Rescue picture description task. One method leverages a procedural programming approach while the second uses Mistral 7b Instruct and Llama 3.1 8b Instruct LLMs. The methods generate transcripts across four severity levels (Mild, Moderate, Severe, Very Severe) through word dropping, filler insertion, and paraphasia substitution. Overall, we found, compared to human-elicited transcripts, Mistral 7b Instruct best captures key aspects of linguistic degradation observed in aphasia, showing realistic directional changes in NDW, word count, and word length amongst the synthetic generation methods. Based on the results, future work should plan to create a larger dataset, fine-tune models for better aphasic representation, and have SLPs assess the realism and usefulness of the synthetic transcripts.

Keywords: aphasia, synthetic data, natural language processing, machine learning

Introduction: Per Nicholas and Brookshire (1993), coding Correct Information Units (CIUs) involves transcribing a connected speech sample verbatim, counting all intelligible words, and then identifying each word that is intelligible, accurate, relevant, and informative about the topic as a CIU, excluding fillers, repetitions, and tangential remarks. From these counts, clinicians calculate the percentage of CIUs and CIUs per minute to quantify communicative informativeness and efficiency.
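The two summary measures mentioned above follow directly from the word and CIU counts. A minimal sketch of the arithmetic, assuming the counts and the sample duration have already been obtained by manual coding (the function name and the example numbers are hypothetical):

```python
def ciu_metrics(total_words, ciu_count, duration_seconds):
    """Return (percent CIUs, CIUs per minute) for one speech sample."""
    if total_words <= 0 or duration_seconds <= 0:
        raise ValueError("total_words and duration_seconds must be positive")
    if not 0 <= ciu_count <= total_words:
        raise ValueError("ciu_count must lie between 0 and total_words")
    percent_cius = 100.0 * ciu_count / total_words        # informativeness
    cius_per_minute = ciu_count / (duration_seconds / 60.0)  # efficiency
    return percent_cius, cius_per_minute


# Illustrative sample: 120 intelligible words, 90 of them CIUs, spoken in 90 s.
percent, per_minute = ciu_metrics(120, 90, 90)
```

The arithmetic is trivial; the time-consuming part is deciding which words count as CIUs, which is the step the study hopes synthetic data can eventually help automate.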

large language model, machine learning, natural language, (20 more...)

2510.24817

Country: North America > United States > Maryland (0.24)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Neurology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

arXiv.org Artificial Intelligence, Sep-9-2025

Exploring approaches to computational representation and classification of user-generated meal logs

Hu, Guanlan, Anand, Adit, Desai, Pooja M., Urteaga, Iñigo, Mamykina, Lena

This study examined the use of machine learning and domain specific enrichment on patient generated health data, in the form of free text meal logs, to classify meals on alignment with different nutritional goals. We used a dataset of over 3000 meal records collected by 114 individuals from a diverse, low income community in a major US city using a mobile app. Registered dietitians provided expert judgement for meal to goal alignment, used as gold standard for evaluation. Using text embeddings, including TFIDF and BERT, and domain specific enrichment information, including ontologies, ingredient parsers, and macronutrient contents as inputs, we evaluated the performance of logistic regression and multilayer perceptron classifiers using accuracy, precision, recall, and F1 score against the gold standard and self assessment. Even without enrichment, ML outperformed self assessments of individuals who logged meals, and the best performing combination of ML classifier with enrichment achieved even higher accuracies. In general, ML classifiers with enrichment of Parsed Ingredients, Food Entities, and Macronutrients information performed well across multiple nutritional goals, but there was variability in the impact of enrichment and classification algorithm on accuracy of classification for different nutritional goals. In conclusion, ML can utilize unstructured free text meal logs and reliably classify whether meals align with specific nutritional goals, exceeding self assessments, especially when incorporating nutrition domain knowledge. Our findings highlight the potential of ML analysis of patient generated health data to support patient centered nutrition guidance in precision healthcare.
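To make the TF-IDF text representation concrete, here is a minimal sketch using the plain textbook weighting (term frequency times log inverse document frequency). The whitespace tokenizer, the function name, and the example meal logs are assumptions for illustration; the study's actual preprocessing is not described in the abstract, and library implementations typically add smoothing and normalization.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per document.

    weight = (count of term in doc / doc length) * log(N / document frequency)
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical free-text meal logs, for illustration only.
meal_logs = ["rice and beans", "rice and chicken", "rice pudding"]
vectors = tfidf(meal_logs)
```

A term that appears in every log (here "rice") gets weight zero, while rarer terms are weighted up; vectors like these can then be fed to a classifier such as logistic regression.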

artificial intelligence, machine learning, nutritional goal, (17 more...)